Giga Games 1

Internet Message Format | 1992-03-22 | 29.6 KB

From rick@eos.arc.nasa.gov Wed Nov 28 18:58:04 1990
Received: from eos.arc.nasa.gov by milton.u.washington.edu (5.61/UW-NDC Revision: 2.1 ) id AA02605; Wed, 28 Nov 90 18:57:57 -0800
Received: Wed, 28 Nov 90 18:58:26 -0800 by eos.arc.nasa.gov (5.65/1.2)
Date: Wed, 28 Nov 90 18:58:26 -0800
From: Rick Jacoby <rick@eos.arc.nasa.gov>
Message-Id: <9011290258.AA20410@eos.arc.nasa.gov>
To: madsax@u.washington.edu
Subject: Re: Sci.virtual-worlds
Status: RO

Software Development Overview

Rick Jacoby, Steve Bryson
Sterling Software, 3/31/90

Table of Contents

    Introduction
    Overview of Hardware and Software
        System Diagram
        Host Computers
        Graphics Systems
        I/O Device Communication
        Trackers
        Gloves
        Audio Systems
        Viewing Stations
        Viewer Electronics
        Gimbal Platform
        Video Editing Workstation
    Software Simulation Framework
    Directory Structure
    Libvived
    Commands
    Calfiles
    Graphics and GOSAMR
    Interfaces and Conventions
        Standard VIEW Options
        Speech and Sound
        Windows
        Menus
        Gestures
    Data Representations
        Input
            Glove
            Tracker
            Speech Recognition
        Output
            Window
            Speech Synthesis
            Audio Cues
    Documentation
    Designing a VIEW Application

Introduction

This document is an overview of the VIEW software development environment. It is intended for people who will be writing software for the virtual environment. It gives both "hard" information, like directory structures and program names, and "soft" information, like our development philosophies.

The VIEW (Virtual Interactive Environment Workstation) system is a set of computer hardware and software and computer-controlled i/o sub-systems. The purpose of the system is to place the user in an artificial, or virtual, reality. Although it is an artificial reality, it may be an enhanced reality, with displays or other features not found in ordinary reality. There is a graphics system that presents pre-defined, solid shaded objects around the user.
There is an audio system that presents synthesized speech, and modulated or constant tones that can be located in space. The user's head position is tracked and is used to control the viewpoint of the graphics system. The user's hand position is tracked and may be used by the system. The positions of the user's fingers are also tracked and can be used to signal the computer through gestures. The user can also give voice commands to the system.

This document is an overview, and the reader should refer to the man pages and other documentation for a more complete description of each item (see the Documentation section below).

Overview of Hardware and Sub-systems

Below is a schematic diagram of most of the VIEW hardware. Absent from the diagram are the boom and its viewer, the gimbal and its cameras, and the video editing/mixing workstation.

The virtual environment hardware consists of a host computer and many i/o sub-systems. The following is an introduction to the computers and sub-systems.

Host Computers: The host computer is an HP 9000/835 and has the network node name "Sim". Sim runs HP-UX, HP's version of UNIX. Sim communicates with the sub-systems in several ways: over the HP backplane, in parallel (AFI card), and serially (RS-232). The serial communication can be done in two ways: through a MUX card or through a Real-Time Interface (RTI) card.

Graphics Systems: There are two graphics systems in the VIEW environment: the ISG Technology graphics system (ISG) and the HP Turbo SRX graphics system (SRX). The ISG is actually a stand-alone computer (Motorola 68020 based) running UNIX, and has the network node name "Stim". It has two drawing channels (one for each eye). Each channel has a drawing engine (DEU, consisting of 16 TI 320c25's), a display processor (DPU, a bit-slice processor), and a frame buffer. The frame buffer in use is 640 x 480 (although there is a 1K x 1K mode that has not been tested). Sim and Stim communicate over a parallel channel (AFI on Sim, DR-11W on Stim).
There are two SRXs on Sim's bus, one for each eye. The SRX frame buffers are 1280 x 1024. The ISG graphics are faster but lower resolution; the SRX graphics are higher resolution but slower.

I/O Device Communication: All other i/o devices communicate with Sim serially. Both the MUX cards and the RTI cards sit on the HP backplane. The standard MUX interface is probably the easiest to code; however, HP's MUX card is not designed as a real-time interface (the ports are polled, and this limits a scenario's frame rate to about 11 Hz). The two RTI cards are like attached co-processors (they are based on an Intel 80c186 with 512 K-bytes of memory and run PSOS). Each RTI has eight serial ports with which to communicate with i/o devices. Programs are down-loaded from Sim to the RTI: one for each device with which to communicate, and a main program to gather the device data and send it to Sim.

Trackers: There are three different six degree of freedom tracking devices: a Polhemus tracker with a source and two sensors, an Ascension tracker with a source and three sensors, and the boom. Both the Polhemus and the Ascension are magnetic trackers; the source emits e-m waves, and coils in the sensors receive the waves. The devices then compute and return the position and orientation of the sensors relative to the source. With two sensors, the Polhemus operates at 18 Hz and is fairly accurate up to 4 feet from its source. With three sensors, the Ascension operates at 30 Hz and is fairly accurate up to 6 feet from its source. They are connected to one of the RTI cards and are used for tracking head and hand (glove) movement.

The boom is used to support, and track the movement of, the boom-mounted viewer. There are six potentiometers on the boom. The potentiometers' analog values are converted to digital values on an STD computer and then sent to an IBM AT, which computes the position and orientation of the viewer relative to the boom base.
The AT can communicate with Sim via RS-232.

Gloves: There are three VPL glove devices, each capable of reading one glove: a Macintosh with glove software, and two stand-alone glove boxes. All three are connected to an RTI. The amount of finger joint flexion is returned to Sim, where the gesture software determines which of the eleven gestures is being made.

Audio Systems: The audio system consists of two MIDI driven Ensoniq (ESQ-M) synthesizers, a Hinton box (RS-232 to MIDI converter), a Dectalk speech synthesizer, a convolvotron, headphones and headphone amp, speakers and speaker amp, and an audio mixer/patch bay. Audio output is either audio cues (discrete or continuous tones) or speech strings. The output is played through room speakers and/or headphones. Audio cues can be convolved (located in 3-space) or un-convolved. Un-convolved cues are routed to the second synthesizer and then to the mixer. Convolved cues are routed to the first synthesizer and then to the convolvotron. The convolvotron output should be routed to the mixer, but due to a convolvotron/mixer signal matching problem, the convolvotron is currently patched around the mixer and goes directly to the headphones. The system is capable of playing 8 un-convolved cues and 2 independently convolved cues at one time.

Speech Recognition: Speech recognition input is performed by a VocaLink speech recognizer. It is connected to an IBM PC, which is connected to a MUX port. The IBM is needed because it runs some VocaLink supplied software (the source to which costs $7-8 K). The VocaLink is capable of understanding connected speech; however, we are currently using it in a discrete time mode.

Viewing Stations: There are two viewing stations for the virtual environment: the helmet and the boom-mounted viewer. The helmet viewer consists of two back-lit, monochromatic, liquid crystal displays with 320 x 220 resolution and diamond shaped pixels.
(Due to the pixel shape, there is an effective higher resolution of 640 x 220.) The color of the display is determined by the color of the back lighting. Tracking the head position and orientation is done with one of the magnetic trackers. The helmet also has earphones (for audio output) and a microphone (for speech input). The boom-mounted viewer consists of two black and white CRTs with 400 x 400 resolution. Tracking is done with the boom output (see above).

Viewer Electronics: The output of the SRX graphics is an RGB signal. To make it useful for the viewer, it is passed through a scan converter, where it is converted to NTSC. The output of the ISG is already NTSC. The NTSC signal is sent to an electronics box (e-box), where it is conditioned for the viewer (sync signal changed, image positioned, etc.). The final signal is sent from the e-box to the viewer.

Gimbal Platform: The gimbal platform is a remotely operated three degree of freedom pointing device. It has two video cameras mounted on the gimbal for stereo video input into the VIEW environment. The gimbal is connected to an IBM AT (the same AT to which the boom is connected). Sending orientation data from the boom to the gimbal can then control where the cameras are looking. The Polhemus can also be connected to the AT so the helmet's orientation can be made to control the gimbal's orientation.

Video Editing Workstation: Output from both the audio mixer and the viewer electronics boxes is routed in parallel to the user (viewer, earphones, and speakers) and to the video editing and mixing workstation. Routing to the workstation allows video documentation of research, demonstrations, and of the system.

See the Documentation section below for details on where to look for more information on most of these topics.

Software Simulation Framework

A skeletal application, skel, was developed as an implementation of the software simulation framework.
Skel is a simple program that uses most of the VIEW i/o devices and many of the VIEW library functions. The skeletal application (or simulation framework) has two purposes: it can be a starting point for new application programs, and it is an example of how to access many of the VIEW devices. Skel is a running program, and code can be added and modified so that a new application is developed from this software 'framework'.

Skel uses the viewio i/o system (RTI), glove and tracker input, gesture recognition, speech input, audio output (convolved and un-convolved), speech synthesizer output, and graphics output using .g files, windows, menus, and text. Skel uses standard VIEW options (SVO), so the user can select the graphics output device, glove and tracker input devices, and user calibration files, all at run time. For more information see the SVO man page and the section of this document "Designing a VIEW Application".

Directory Structure

Most of the VIEW developed software is kept in directories under the /usr/vived directory. The major categories of sub-directories are: source files, include files, executable utilities, VIEW library, user data, demos, and experimental work. Most of the source files are backed up in the same directory as RCS files (,v files). The following is a list of most of the directories and their intended use.
    /usr/vived/src/lib/libvived
        .c files for libvived.a
    /usr/vived/src/lib/libvived/gosamr
        .c files for GOSAMR portion of libvived.a
    /usr/vived/src/lib/cal
        device dependent source for cal file creation
    /usr/vived/src/lib/ins
        source for install script
    /usr/vived/src/cmd
        source for utilities
    /usr/vived/src/cmd/mkcal
        source for mkcal
    /usr/vived/src/include
        source for .h files used in libvived.a and application programs
    /usr/vived/src/man
        sub-directories for sources for man pages
    /usr/vived/src/misc
        sub-directories for interesting and useful non-standard code
    /usr/vived/src/misc/external-eye
        window.c that creates one external viewpoint and one internal viewpoint in an application
    /usr/vived/src/rti/blocking
        source for RTI code where rtimain does a blocking write
    /usr/vived/src/rti/non-blocking
        source for RTI code where rtimain does a non-blocking write
    /usr/vived/src/rtihost
        source for host side of RTI
    /usr/vived/bin
        executable utilities
    /usr/vived/exp
        sub-directories of user code under development
    /usr/vived/include
        installed .h files used in libvived.a and application programs
    /usr/vived/lib
        installed libvived.a, installed cal files, installed install script
    /usr/vived/lib/userdata
        sub-directories for each user's glove and convolvotron calibration files
    /usr/vived/lib/gdata
        GOSAMR file for the installed hand
    /usr/vived/lib/rti
        executable RTI code
    /usr/vived/lib/rtihost
        executable for host side of RTI
    /usr/vived/doc
        ACE documentation
    /usr/vived/demo
        installed demos
    /usr/vived/demo/...data
        data files for demos
    /usr/vived/demo/...data/code
        user source code for installed demos

Libvived

The VIEW software engineers have developed an extensive library of user-callable functions which can be used to build a scenario. The installed version of the library is located in /usr/vived/lib. These functions are written in C for the most part, although YACC and LEX are used in parsing the graphics files.
There are over 60 C files in /usr/vived/src/lib/libvived and /usr/vived/src/lib/libvived/gosamr that make up the libvived.a library. The corresponding sources for the include files are in /usr/vived/src/include and the installed include files are in /usr/vived/include. The RCS versions of the source files are in the same directory as the sources.

Commands

There are ten utilities in /usr/vived/bin. The sources for these utilities are in /usr/vived/src/cmd. The following is a list of utilities and their use.

    ace          edit and test audio cues and patches
    boomtest     test and calibrate boom
    glovecalib   calibrate glove
    glovetest    test and calibrate glove
    mkcal        make calibration (cal) files
    qtext        make a text listing of a cue file
    spsyntest    test the Dectalk speech synthesizer
    tracktest    test the magnetic trackers
    vocatest     test the VocaLink speech recognizer
    vpsize       test resolution, contrast, and brightness of video display device

Calfiles

Cal files provide a way of changing i/o device parameters and characteristics (such as which port a device is connected to, its baud rate, etc.) without having to recompile the application software. The application software specifies the cal file to use, and the device initialization software reads the file and acts accordingly. If characteristics are changed, the cal file needs to be recompiled but the application programs need not be. Cal files were developed when all i/o was done on the MUXes. Cal files are not being used for the RTI i/o. The source files that specify the device characteristics are in /usr/vived/src/lib/cal. The installed cal files are in /usr/vived/include.

Graphics and GOSAMR

Every graphics engine has its own native graphics language. GOSAMR (Graphic Object Specification, Manipulation, and Rendering functions) is meant to be a device independent way of specifying graphics objects and their transformations. GOSAMR borrows a lot of its concepts and terminology from PHIGS (Programmer's Hierarchical Interactive Graphics System).
GOSAMR allows the ASCII representation of objects and their characteristics and transformations. Objects are made of points, lines, polygons, and sub-objects. Characteristics include luminance and visibility. Transformations include scale, rotation, and translation. Characteristics and transformations can be changed in real-time. Objects, their characteristics, and their transformations are defined in text files. These files are read by the GOSAMR software in an application, turned into graphic objects, and rendered.

Graphics objects are arranged in hierarchies called "structure networks" or trees. On the SRX graphics, the tree is called a display list. On the ISG, the tree is a collection of ISG objects (which behave differently than a display list). User defined labels allow the user to access the transformations and characteristics in the tree and modify them during program execution.

There are three phases of drawing an object. The first is object creation, when the ASCII files are read and the trees are created. The second is compilation, when changed transformations or characteristics are put into the trees. The third is traversal, when the tree is "walked" and the graphic objects are rendered. The first phase is executed once (usually) at the beginning of a program. Phases two and three are executed (usually) once per frame.

The first phase, when the trees are created, is device independent. If the graphics machine is the SRXs, phase two modifies the tree and creates or modifies the display lists, and phase three renders the display list. If the graphics machine is the ISG, phase two modifies the tree and creates the ISG objects if they are not already created. The objects are both modified and rendered in phase three.

There are two noteworthy features of GOSAMR: traversal control and escapes. Traversal control can allow normal tree traversal, return back up the tree without walking any part of the tree below, or continued traversal with invisibility (i.e.
do the transformations below but do not show the graphics until visibility is again turned on). Escapes allow the calling of a user-specified function from down in the graphics tree. This is useful for retrieving transformation matrices of (usually invisible) objects and then using the matrices for collision detection calculations between objects.

The VIEW system provides window functions that, when passed a GOSAMR structure, will draw the graphics objects in a 3-d window, once for each eye. Window parameters such as perspective matrix, field of view, inter-pupillary distance, etc. are part of the windowing structure and can be accessed or modified by the programmer. The window functions also allow you to draw windows inside of other windows.

The VIEW system also provides easy to use menu creation and interaction functions. Menu parameters such as number of columns, action upon selection, whether the menu goes away upon selection, etc. can be easily changed by the programmer. The menu software creates a GOSAMR graphics object from the data you pass.

Interfaces and Conventions

VIEW software conventions and program interface conventions are evolving as the VIEW environment is developed. Several of these conventions are described below.

Standard VIEW Options (SVO): The Standard VIEW Options software provides an easy to use, standard interface for VIEW programs. The software is based on a data structure (svodata) that is initialized (and optionally modified) by the programmer, and may be interrogated and/or modified by the user at run-time. Options that can be set or changed are choice of graphics engine, glove calibration file, convolvotron calibration file, right and left glove devices, and head, left and right tracking devices.

Speech and Sound: All speech and sound output is triggered via the event mechanism to ensure that all sound outputs are handled consistently. There are currently no conventions for speech input and output. Use is defined for each application.
Windows: All graphic output is rendered into an overlapping window system. The window system creates a stereo pair of images, one half of the pair for each eye. Each half of the stereo pair is rendered from a slightly different viewpoint. The viewpoints are separated by the inter-pupillary distance of an 'average' person. Several overlapping windows may be defined with different graphics in each. Windows may occlude other windows or may be transparent.

Menus: The user may create a menu. The menu then exists in the VIEW database as a three dimensional object. The user can move this menu and select objects via VIEW library menu calls. The moving and selection are usually done via gestures. The user assigns a function to each item in the menu, and when an item is selected the corresponding function is executed.

Gestures: There is a standard set of gestures predefined in the VIEW environment. The data returned from each glove can be passed to a VIEW library routine. This routine returns a value indicating the gesture being made by that glove. The user can then use this value to determine what action should be taken by the application.

Data Representations

INPUT

Glove: The VIEW library routines that handle the glove input return, for each glove, a struct which contains the joint angles for each measured finger joint of the user's hand. The angles are measured in radians, with straight fingers returning zero. The glove struct is defined as:

    struct glove {
        struct finger {
            float mpj;    /* joint where finger meets palm */
            float pipj;   /* middle joint of finger */
        } finger[5];      /* array of one for each finger */
        float palm;       /* degree of bend of palm NOT CURRENTLY SUPPORTED */
        float web[4];     /* contains abduction bend data NOT CURRENTLY SUPPORTED */
    };

The order of the glove.finger array is THUMB, FOREFINGER, MIDDLEFINGER, RINGFINGER, LITTLEFINGER (these values are defined in glove.h).

Tracker: There are two types of data received from the trackers, position and orientation.
The position data is returned as a struct called vector3d, defined as:

    struct vector3d {
        float x, y, z;
    };

The position data is in a coordinate system with 0, 0, 0 on the floor directly under the tracker source. Positive x is forward (the direction the tracker source is pointing), positive y is to the right (when facing forward), and positive z is up. The units of position measurement are meters.

There are three data formats for receiving orientation from the trackers: quaternion, matrix, and euler. The user chooses to receive one of these formats by using the appropriate call from the VIEW library. The orientation is defined relative to the coordinate system defined above.

When the orientation data is received as a quaternion, the data is returned in a struct defined as:

    struct quaternion {
        float w, x, y, z;
    };

An axis for rotation is defined by the origin and the x, y, and z values. For a unit quaternion, the angle of rotation about that axis is 2 * arccosine(w). It is sufficient in normal use of the VIEW environment to pass the quaternion values to the animation and window routines without worrying too much about their significance.

When the orientation is returned as a matrix, the data is returned in a data type Matrix, defined as:

    typedef float Matrix[4][4];

The upper left 3x3 corner of this matrix contains a normalized rotation matrix, and the 4, 4 element is 1.0. All other elements are 0.0.

When the orientation is returned as a set of euler angles, the data is returned as a vector3d (see above), with the x field = roll, y field = pitch, and z field = yaw.

Speech Recognition: The output from the speech recognition system is returned as a pointer to an ASCII string. This string contains the last single word recognized by the speech recognition system.

OUTPUT

Windows: A window is defined by a struct. The various fields of the struct represent attributes of the window, and they are initialized by the function call that creates the window.
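
To relate the quaternion and Matrix orientation formats described in the Tracker section above, here is a minimal sketch of the textbook conversion from a unit quaternion to a 4x4 matrix laid out as the text describes (rotation in the upper left 3x3, 1.0 in the 4, 4 element, 0.0 elsewhere). The struct and typedef are copied from this document; the function name quat_to_matrix is hypothetical and is not a libvived call. The sketch assumes a unit quaternion and the column-vector convention (v' = R v); whether the VIEW library uses the same convention or handedness is not stated here, so check quaternion(3) and matrix(3) before relying on it.

```c
#include <assert.h>
#include <stddef.h>

/* Types as given in this document. */
struct quaternion {
    float w, x, y, z;
};
typedef float Matrix[4][4];

/*
 * Hypothetical helper (not part of libvived): fill m with the rotation
 * described by the unit quaternion q.
 * ASSUMPTIONS: q has unit length; column-vector convention (v' = R v).
 */
void quat_to_matrix(const struct quaternion *q, Matrix m)
{
    float w, x, y, z;
    int i, j;

    assert(q != NULL);
    w = q->w; x = q->x; y = q->y; z = q->z;

    /* Start from all zeros, then set the 4, 4 element to 1.0. */
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            m[i][j] = 0.0f;
    m[3][3] = 1.0f;

    /* Upper left 3x3: standard unit-quaternion rotation matrix. */
    m[0][0] = 1.0f - 2.0f * (y * y + z * z);
    m[0][1] = 2.0f * (x * y - w * z);
    m[0][2] = 2.0f * (x * z + w * y);

    m[1][0] = 2.0f * (x * y + w * z);
    m[1][1] = 1.0f - 2.0f * (x * x + z * z);
    m[1][2] = 2.0f * (y * z - w * x);

    m[2][0] = 2.0f * (x * z - w * y);
    m[2][1] = 2.0f * (y * z + w * x);
    m[2][2] = 1.0f - 2.0f * (x * x + y * y);
}
```

For example, a quaternion with w = z = 0.7071 and x = y = 0 (a quarter turn about the z axis) produces a matrix whose first column is approximately (0, 1, 0, 0), so under the stated convention the x axis is carried onto the y axis.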
The user may access and modify fields of this struct, both through various calls and directly. Window position is defined in terms of screen coordinates. The position of a window is given by a rectangle indicating the top left corner and bottom right corner of the window on the screen. The bottom left corner of the screen is (0.0, 0.0), and the top right corner of the screen is (1.0, 1.0).

Speech Synthesis: Speech output is initiated by passing a string to the speech synthesizer. There are currently no conventions for speech output. Speech output can also be accomplished using the audio cue system.

Audio Cues: All audio data is defined via the user command "ace". The ace program creates a file of audio cues. Each cue is accessed by an ASCII string via the audio and event routines.

Documentation

There are currently about 50 locally written man pages covering various aspects of the VIEW software and hardware. Listed below are topics and a list of man pages (many of the man pages could actually be listed under several topics).

    glove: glovecalib(1), glovetest(1), calgl(3), gesture(3), glove(3)
    tracker: tracktest(1), tracker(3)
    audio & events: qtext(1), spsyntest(1), audio(3), event(3), qtimeout(3), spsyn(3)
    graphics: vpsize(1), drawgs(3), drawstr(3), gedep(3), getgs(3), getlabgse(3), ghand(3), gosamr(3), loadgfile(3), menu(3), setgseval(3), viewer(3), window(3), AFI(3), ISG(3)
    speech input: vocatest(1), voca(3)
    scenarios: tele(1), skel(1)
    math, matrices, quaternions: fastmath(3), fxform(3), finvrmat(3), mattotrs(3), mattoxform(3), quaternion(3), vector3d(3), matrix(3), misc(3), viewmath(3)
    miscellaneous i/o: getcal(3), kybd(3), svo(3), viewio(3)
    boom: boom(3), boomcheck(3)

A hardcopy version of the locally written man pages is available in the Man Page notebook. Some non-locally written man pages that may be of interest are: rcs(1), rtiintro(3t), and starbase(3g).
Other locally written documents that may be of interest are:

    Virtual Interactive Environment Workstation Notebook ("white binder")
    VIVED System Documentation
    Virtual Visual Environment Display System "Quick Look" User's Guide
    Remote Camera System Document
    1989 SIGGRAPH Tutorial Notes
    Audio Cue Editor (ACE) User's Manual
    VocaLink-VIEW's Voice Recognition System
    README files in /usr/vived/demo and its sub-directories

Also of interest is the video document "VIEW: The Ames Virtual Workstation".

Designing a VIEW Application

A typical VIEW application contains several interconnected elements. This section describes those elements and how they communicate. The relevant call for each step is given in parentheses. See the source code for the skeleton application, or the skel man page, for a specific example.

Overview: The basic VIEW application uses input from the trackers and gloves, and outputs graphics to a stereo viewer. Additional features may include speech input, speech and audio output, menus, and windows in addition to the main window. Topics not covered here include use of the convolvotron and special (non-GOSAMR) graphics.

Initialization: Initialization of the VIEW environment is still somewhat finicky. First initialize the user defaults via the svo structure (initsvo). The fields of the svo structure should be used when initializing all I/O devices. The viewio structure should then be initialized (initviewio), which sets up the RTI serial communications with various devices. The I/O devices are then individually initialized (inittracker, for example). If audio input and output are used, the audio and speech devices should be initialized (initaudio, initspsyn, and initvoca). Then the graphics windows should be initialized (initwindows) and defined (makewindow). Remember that one window creates the stereo pair for both eyes. Any other windows should also be defined at this time.
The user should then load all GOSAMR graphic data (loadgfile) and initialize the graphic hand (initghand). Any menus to be used should be defined at this time (initmenus and initmenu). If the user wishes to use UNIX system signal vectors, they should be defined AFTER the windows are defined (the window code uses signals for parallel communication with the graphic computer). After initialization, the graphics should be compiled (compallgs), and all windows should be drawn once (setupwindow) before the program loop is entered. Finally, hide any windows that should not be showing at the beginning of the application (hidewindow).

Program loop: The simplest VIEW application's program loop 1) collects data from the trackers, 2) processes the data, and 3) draws the graphics for this frame and updates audio output (if any).

1) Collect data: First, get the data from the RTI system to the host computer (getviewio). Then collect the data for each device (gettrackerq, getglove, etc.). For the head tracker data, there is a special call (trackheadq, for example) which returns the position of the center of the head. Also collect any speech data here (getvoca).

2) Process the data: What happens here is largely up to the user. The only standard action in this phase is the animation of the graphic hand (animghand) using the current glove and glove tracker data. The user may interpret the glove data via the gesture structure (gesture). The value returned from the gesture routine may then be used to initiate some action such as picking up an object or triggering a sound event. The user may pass the gesture and hand position to the menu structure (posmenu and hitamenu) to trigger a menu action.

3) Draw graphics and update audio: First make sure all windows that the user wishes to display are visible (showwindow). Then compile any changes to the graphics (compallgs). This must be done even though the user may not have initiated any changes.
Call compallgs before any drawing and DO NOT call compallgs more than once per loop. Then update all windows (setupwindow) with the graphic data for that window and the current head tracker data. Setupwindow must be called for each graphic structure in that window. Finally, cause the actual graphic screen to be updated (drawwindows). If the audio event structure is used, the audio system should be updated (maintain_score). If the convolvotron is being used, update the convolvotron's head position (cv_newhead_q).

Exiting the program: First close the individual I/O devices that are open (shutglove, shuttracker, etc.). Then shut down the RTI (closeviewio) and the graphics (Rclose). Finally, if the audio structure is used, close it (shutaudio).
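
The initialization, program loop, and exit sequence described above can be summarized in a short C sketch. The function names are the ones given in parentheses in the text, but every function here is a stand-in stub that only records its own name: the real libvived routines take arguments (see the man pages listed under Documentation), and the void signatures, the STUB macro, call_log, count_calls, and run_view_app are all assumptions made for illustration. The sketch shows call ordering only, and it leaves out the convolvotron update (cv_newhead_q), which applies only when the convolvotron is in use.

```c
#include <assert.h>
#include <string.h>

#define MAX_CALLS 256

/* Ordered record of stub calls, kept only so the sequence can be inspected. */
const char *call_log[MAX_CALLS];
int call_count = 0;

static void record_call(const char *name)
{
    if (call_count < MAX_CALLS)
        call_log[call_count++] = name;
}

/* How many times a given call name appears in the log. */
int count_calls(const char *name)
{
    int i, n = 0;
    for (i = 0; i < call_count; i++)
        if (strcmp(call_log[i], name) == 0)
            n++;
    return n;
}

/* ASSUMPTION: stand-in stubs with invented void signatures, not libvived. */
#define STUB(name) static void name(void) { record_call(#name); }

STUB(initsvo) STUB(initviewio) STUB(inittracker)
STUB(initaudio) STUB(initspsyn) STUB(initvoca)
STUB(initwindows) STUB(makewindow) STUB(loadgfile) STUB(initghand)
STUB(initmenus) STUB(initmenu) STUB(hidewindow)
STUB(getviewio) STUB(gettrackerq) STUB(getglove) STUB(trackheadq)
STUB(getvoca) STUB(animghand) STUB(gesture) STUB(posmenu) STUB(hitamenu)
STUB(showwindow) STUB(compallgs) STUB(setupwindow) STUB(drawwindows)
STUB(maintain_score)
STUB(shutglove) STUB(shuttracker) STUB(closeviewio) STUB(Rclose)
STUB(shutaudio)

/* Run the skeleton for a fixed number of frames; returns 0 when done. */
int run_view_app(int frames)
{
    int frame;

    assert(frames >= 0);

    /* Initialization, in the order the text gives. */
    initsvo();                      /* user defaults first */
    initviewio();                   /* RTI serial communications */
    inittracker();                  /* individual I/O devices */
    initaudio(); initspsyn(); initvoca();
    initwindows(); makewindow();    /* one window = stereo pair */
    loadgfile(); initghand();
    initmenus(); initmenu();
    compallgs();                    /* compile graphics once */
    setupwindow();                  /* draw every window once */
    hidewindow();                   /* hide what should start hidden */

    for (frame = 0; frame < frames; frame++) {
        /* 1) Collect data. */
        getviewio(); gettrackerq(); getglove(); trackheadq(); getvoca();

        /* 2) Process the data (application specific beyond these). */
        animghand(); gesture(); posmenu(); hitamenu();

        /* 3) Draw graphics and update audio. */
        showwindow();
        compallgs();                /* exactly once per loop */
        setupwindow();              /* once per graphic structure */
        drawwindows();
        maintain_score();
    }

    /* Exit: devices, then the RTI and graphics, then audio. */
    shutglove(); shuttracker();
    closeviewio(); Rclose();
    shutaudio();
    return 0;
}
```

Running run_view_app(2) and then inspecting call_log gives the full sequence; count_calls("compallgs") is then one more than the number of frames, reflecting the single compile after initialization plus one per loop. The skel source remains the authoritative example of the real calls and their arguments.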